Overview

Dataset statistics

Number of variables: 18
Number of observations: 5983
Missing cells: 8618
Missing cells (%): 8.0%
Duplicate rows: 12
Duplicate rows (%): 0.2%
Total size in memory: 841.5 KiB
Average record size in memory: 144.0 B
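
The percentage and per-record figures above can be re-derived from the raw counts. The following is a minimal sketch of that arithmetic; the variable names are illustrative, and the 1 KiB = 1024 bytes conversion is an assumption.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 18
n_observations = 5983
missing_cells = 8618
duplicate_rows = 12
total_size_kib = 841.5

# Every cell is one (row, column) pair.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three results agree with the reported 8.0%, 0.2% and 144.0 B.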

Variable types

Numeric: 8
Categorical: 10

Alerts

Dataset has 12 (0.2%) duplicate rows Duplicates
pickup_place has a high cardinality: 898 distinct values High cardinality
place_category has a high cardinality: 57 distinct values High cardinality
item_name has a high cardinality: 2277 distinct values High cardinality
item_category_name has a high cardinality: 767 distinct values High cardinality
how_long_it_took_to_order has a high cardinality: 2170 distinct values High cardinality
when_the_delivery_started has a high cardinality: 4867 distinct values High cardinality
when_the_courier_arrived_at_pickup has a high cardinality: 4446 distinct values High cardinality
when_the_courier_left_pickup has a high cardinality: 4428 distinct values High cardinality
when_the_courier_arrived_at_dropoff has a high cardinality: 4849 distinct values High cardinality
pickup_lat is highly correlated with pickup_lon and 1 other field High correlation
pickup_lon is highly correlated with pickup_lat and 1 other field High correlation
dropoff_lat is highly correlated with pickup_lat and 1 other field High correlation
dropoff_lon is highly correlated with pickup_lon and 1 other field High correlation
pickup_lat is highly correlated with dropoff_lat High correlation
dropoff_lat is highly correlated with pickup_lat High correlation
place_category is highly correlated with pickup_lat and 1 other field High correlation
pickup_lat is highly correlated with place_category and 3 other fields High correlation
pickup_lon is highly correlated with place_category and 3 other fields High correlation
dropoff_lat is highly correlated with pickup_lat and 2 other fields High correlation
dropoff_lon is highly correlated with pickup_lat and 2 other fields High correlation
place_category has 883 (14.8%) missing values Missing
item_name has 1230 (20.6%) missing values Missing
item_quantity has 1230 (20.6%) missing values Missing
item_category_name has 1230 (20.6%) missing values Missing
how_long_it_took_to_order has 2945 (49.2%) missing values Missing
when_the_courier_arrived_at_pickup has 550 (9.2%) missing values Missing
when_the_courier_left_pickup has 550 (9.2%) missing values Missing
how_long_it_took_to_order is uniformly distributed Uniform
when_the_delivery_started is uniformly distributed Uniform
when_the_courier_arrived_at_pickup is uniformly distributed Uniform
when_the_courier_left_pickup is uniformly distributed Uniform
when_the_courier_arrived_at_dropoff is uniformly distributed Uniform
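
The per-column missing counts in the alerts can be cross-checked against the overview. A minimal sketch, using only the counts listed in the alerts above (the dictionary name is illustrative):

```python
# Missing-value counts copied from the "Missing" alerts.
missing_by_column = {
    "place_category": 883,
    "item_name": 1230,
    "item_quantity": 1230,
    "item_category_name": 1230,
    "how_long_it_took_to_order": 2945,
    "when_the_courier_arrived_at_pickup": 550,
    "when_the_courier_left_pickup": 550,
}
n_observations = 5983

# Share of rows with a missing value, per column.
missing_pct = {
    column: 100 * count / n_observations
    for column, count in missing_by_column.items()
}
for column, pct in missing_pct.items():
    print(f"{column}: {pct:.1f}%")

# The column totals should add up to the dataset-wide "Missing cells" figure.
total_missing = sum(missing_by_column.values())
print(f"Total missing cells: {total_missing}")
```

The total comes to 8618, the same value the dataset statistics report as "Missing cells", which suggests these seven columns account for all missing values.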

Reproduction

Analysis started: 2022-01-29 20:33:12.747810
Analysis finished: 2022-01-29 20:33:27.222641
Duration: 14.47 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
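
The reported duration follows from the two timestamps. A small sketch using only the standard library (the format string is an assumption about how the timestamps are laid out):

```python
from datetime import datetime

# Assumed layout of the timestamps shown in the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-01-29 20:33:12.747810", FMT)
finished = datetime.strptime("2022-01-29 20:33:27.222641", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```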

Variables

delivery_id
Real number (ℝ≥0)

Distinct: 5214
Distinct (%): 87.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1379495.072
Minimum: 1271706
Maximum: 1491424
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.9 KiB

Quantile statistics

Minimum: 1271706
5-th percentile: 1281674
Q1: 1322792.5
median: 1375689
Q3: 1436371
95-th percentile: 1480980.2
Maximum: 1491424
Range: 219718
Interquartile range (IQR): 113578.5

Descriptive statistics

Standard deviation: 64593.9744
Coefficient of variation (CV): 0.04682436039
Kurtosis: -1.221358888
Mean: 1379495.072
Median Absolute Deviation (MAD): 56073
Skewness: 0.08500157938
Sum: 8253519014
Variance: 4172381529
Monotonicity: Not monotonic
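
Several of the delivery_id statistics are derived from others in the same tables. The sketch below recomputes them from the reported values; because the inputs are already rounded, the variance and sum only match the report approximately.

```python
# Reported values for delivery_id.
minimum, maximum = 1271706, 1491424
q1, q3 = 1322792.5, 1436371
mean, std = 1379495.072, 64593.9744
n_observations = 5983  # no missing values in this column

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n_observations     # Sum, recovered from the (rounded) mean

print(value_range, iqr)
print(f"CV = {cv:.6f}, variance = {variance:.0f}, sum = {total:.0f}")
```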
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
1460296               5       0.1%
1286826               4       0.1%
1475127               4       0.1%
1435427               4       0.1%
1397032               4       0.1%
1285292               4       0.1%
1482492               4       0.1%
1374803               4       0.1%
1460699               4       0.1%
1319971               4       0.1%
Other values (5204)   5942    99.3%

Value                 Count   Frequency (%)
1271706               1       < 0.1%
1271751               1       < 0.1%
1271867               1       < 0.1%
1272279               1       < 0.1%
1272303               1       < 0.1%
1272363               1       < 0.1%
1272372               1       < 0.1%
1272382               1       < 0.1%
1272439               1       < 0.1%
1272451               1       < 0.1%

Value                 Count   Frequency (%)
1491424               2       < 0.1%
1491341               1       < 0.1%
1491147               1       < 0.1%
1491144               1       < 0.1%
1491110               1       < 0.1%
1491090               1       < 0.1%
1491029               1       < 0.1%
1490893               1       < 0.1%
1490865               1       < 0.1%
1490829               1       < 0.1%

customer_id
Real number (ℝ≥0)

Distinct: 3192
Distinct (%): 53.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 176472.5955
Minimum: 242
Maximum: 405547
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.9 KiB

Quantile statistics

Minimum: 242
5-th percentile: 44005
Q1: 77817
median: 131093
Q3: 293381
95-th percentile: 377245.5
Maximum: 405547
Range: 405305
Interquartile range (IQR): 215564

Descriptive statistics

Standard deviation: 116414.4878
Coefficient of variation (CV): 0.6596745942
Kurtosis: -1.131648746
Mean: 176472.5955
Median Absolute Deviation (MAD): 69479
Skewness: 0.5926477933
Sum: 1055835539
Variance: 1.355233298 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
369272                28      0.5%
52832                 23      0.4%
275689                17      0.3%
125123                16      0.3%
91817                 16      0.3%
58898                 16      0.3%
100889                14      0.2%
115610                13      0.2%
250494                13      0.2%
301032                12      0.2%
Other values (3182)   5815    97.2%

Value                 Count   Frequency (%)
242                   1       < 0.1%
641                   1       < 0.1%
1311                  2       < 0.1%
1517                  1       < 0.1%
2533                  2       < 0.1%
3253                  1       < 0.1%
4305                  1       < 0.1%
5056                  1       < 0.1%
5139                  2       < 0.1%
5444                  1       < 0.1%

Value                 Count   Frequency (%)
405547                1       < 0.1%
405334                1       < 0.1%
405233                1       < 0.1%
405147                1       < 0.1%
404787                1       < 0.1%
404786                1       < 0.1%
404649                1       < 0.1%
403994                1       < 0.1%
403833                1       < 0.1%
403796                1       < 0.1%

courier_id
Real number (ℝ≥0)

Distinct: 578
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 102661.6025
Minimum: 3296
Maximum: 181543
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.9 KiB

Quantile statistics

Minimum: 3296
5-th percentile: 19432
Q1: 60761
median: 113364
Q3: 143807
95-th percentile: 165848
Maximum: 181543
Range: 178247
Interquartile range (IQR): 83046

Descriptive statistics

Standard deviation: 48607.21179
Coefficient of variation (CV): 0.4734702225
Kurtosis: -1.093560594
Mean: 102661.6025
Median Absolute Deviation (MAD): 39377
Skewness: -0.399759354
Sum: 614224368
Variance: 2362661038
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
99219                 78      1.3%
104533                76      1.3%
142394                73      1.2%
66416                 62      1.0%
61900                 61      1.0%
30743                 58      1.0%
3296                  57      1.0%
32580                 56      0.9%
20962                 56      0.9%
23359                 54      0.9%
Other values (568)    5352    89.5%

Value                 Count   Frequency (%)
3296                  57      1.0%
3592                  2       < 0.1%
3941                  1       < 0.1%
5935                  19      0.3%
6458                  2       < 0.1%
6715                  1       < 0.1%
6873                  35      0.6%
7833                  28      0.5%
8625                  15      0.3%
9078                  6       0.1%

Value                 Count   Frequency (%)
181543                3       0.1%
179183                1       < 0.1%
178325                8       0.1%
177847                7       0.1%
177485                1       < 0.1%
177231                2       < 0.1%
177184                1       < 0.1%
177125                1       < 0.1%
176701                3       0.1%
176250                1       < 0.1%

vehicle_type
Categorical

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 46.9 KiB

bicycle               4274
car                   1215
walker                274
van                   76
scooter               75
Other values (2)      69

Length

Max length: 10
Median length: 7
Mean length: 6.085575798
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: van
2nd row: bicycle
3rd row: bicycle
4th row: bicycle
5th row: bicycle

Common Values

Value                 Count   Frequency (%)
bicycle               4274    71.4%
car                   1215    20.3%
walker                274     4.6%
van                   76      1.3%
scooter               75      1.3%
truck                 48      0.8%
motorcycle            21      0.4%
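
The frequency column is each count divided by the number of observations. A short sketch of that calculation from the counts in the table (the dictionary name is illustrative):

```python
# Counts copied from the vehicle_type "Common Values" table.
vehicle_counts = {
    "bicycle": 4274,
    "car": 1215,
    "walker": 274,
    "van": 76,
    "scooter": 75,
    "truck": 48,
    "motorcycle": 21,
}

n_observations = sum(vehicle_counts.values())

# Percentage share of each category, formatted to one decimal place.
shares = {
    vehicle: f"{100 * count / n_observations:.1f}%"
    for vehicle, count in vehicle_counts.items()
}
for vehicle, share in shares.items():
    print(f"{vehicle}: {share}")

# The summary list above groups the two rarest categories as "Other values (2)".
other_values = vehicle_counts["truck"] + vehicle_counts["motorcycle"]
print(f"Other values (2): {other_values}")
```

The counts sum to 5983, the number of observations, so no vehicle_type value is missing, and truck plus motorcycle gives the 69 shown as "Other values (2)".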

Length

Histogram of lengths of the category

Pie chart

pickup_place
Categorical

HIGH CARDINALITY

Distinct: 898
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Memory size: 46.9 KiB

Shake Shack                 311
Momofuku Milk Bar           186
The Meatball Shop           184
Blue Ribbon Sushi           151
sweetgreen                  149
Other values (893)          5002

Length

Max length: 39
Median length: 15
Mean length: 15.57295671
Min length: 2

Unique

Unique: 457
Unique (%): 7.6%

Sample

1st row: Melt Shop
2nd row: Prince Street Pizza
3rd row: Bareburger
4th row: Juice Press
5th row: Blue Ribbon Sushi

Common Values

Value                       Count   Frequency (%)
Shake Shack                 311     5.2%
Momofuku Milk Bar           186     3.1%
The Meatball Shop           184     3.1%
Blue Ribbon Sushi           151     2.5%
sweetgreen                  149     2.5%
Blue Ribbon Fried Chicken   133     2.2%
Whole Foods Market          119     2.0%
Parm                        102     1.7%
RedFarm Broadway            93      1.6%
Mighty Quinn's BBQ          90      1.5%
Other values (888)          4465    74.6%

Length

Histogram of lengths of the category

Value                       Count   Frequency (%)
sushi                       454     3.0%
blue                        437     2.9%
ribbon                      422     2.8%
bar                         410     2.7%
(blank)                     343     2.2%
the                         332     2.2%
shack                       314     2.1%
shake                       312     2.0%
momofuku                    281     1.8%
shop                        234     1.5%
Other values (1307)         11734   76.8%

place_category
Categorical

HIGH CARDINALITY
HIGH CORRELATION
MISSING

Distinct: 57
Distinct (%): 1.1%
Missing: 883
Missing (%): 14.8%
Memory size: 46.9 KiB

Italian               504
Burger                454
Japanese              433
American              405
Chinese               332
Other values (52)     2972

Length

Max length: 21
Median length: 7
Mean length: 7.396470588
Min length: 3

Unique

Unique: 6
Unique (%): 0.1%

Sample

1st row: American
2nd row: Pizza
3rd row: Burger
4th row: Juice Bar
5th row: Japanese

Common Values

Value                 Count   Frequency (%)
Italian               504     8.4%
Burger                454     7.6%
Japanese              433     7.2%
American              405     6.8%
Chinese               332     5.5%
Dessert               315     5.3%
Sushi                 253     4.2%
Salad                 206     3.4%
Grocery Store         187     3.1%
Mexican               178     3.0%
Other values (47)     1833    30.6%
(Missing)             883     14.8%

Length

Histogram of lengths of the category

Value                 Count   Frequency (%)
italian               504     8.7%
burger                454     7.8%
american              443     7.6%
japanese              433     7.5%
chinese               332     5.7%
dessert               315     5.4%
store                 315     5.4%
sushi                 253     4.4%
salad                 206     3.5%
grocery               187     3.2%
Other values (56)     2366    40.7%

item_name
Categorical

HIGH CARDINALITY
MISSING

Distinct: 2277
Distinct (%): 47.9%
Missing: 1230
Missing (%): 20.6%
Memory size: 46.9 KiB

Fries                       76
Cheese Fries                35
Shackburger                 31
Chicken                     30
Shack Burger                29
Other values (2272)         4552

Length

Max length: 79
Median length: 15
Mean length: 16.67220703
Min length: 3

Unique

Unique: 1502
Unique (%): 31.6%

Sample

1st row: Lemonade
2nd row: Neapolitan Rice Balls
3rd row: Bare Sodas
4th row: OMG! My Favorite Juice!
5th row: Spicy Tuna & Tempura Flakes
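
The Distinct (%) and Unique (%) figures for item_name appear to be taken relative to the non-missing rows rather than all observations. This is inferred from the reported numbers, not stated in the report; the sketch below shows the arithmetic under that assumption.

```python
# Reported values for item_name.
n_observations = 5983
missing = 1230
distinct = 2277
unique = 1502

# Assumption: percentages use only rows where item_name is present.
non_missing = n_observations - missing

distinct_pct = 100 * distinct / non_missing
unique_pct = 100 * unique / non_missing

# For comparison: the same ratio over all rows, including missing ones.
distinct_pct_all_rows = 100 * distinct / n_observations

print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%): {unique_pct:.1f}%")
print(f"Distinct over all rows: {distinct_pct_all_rows:.1f}%")
```

With the non-missing denominator the results round to the reported 47.9% and 31.6%; dividing by all 5983 rows would give a noticeably lower distinct share.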

Common Values

Value                       Count   Frequency (%)
Fries                       76      1.3%
Cheese Fries                35      0.6%
Shackburger                 31      0.5%
Chicken                     30      0.5%
Shack Burger                29      0.5%
Hamburger                   26      0.4%
B'day Cake Truffles         26      0.4%
Classic Beef                24      0.4%
ShackBurger                 24      0.4%
Cheeseburger                23      0.4%
Other values (2267)         4429    74.0%
(Missing)                   1230    20.6%

Length

Histogram of lengths of the category

Value                       Count   Frequency (%)
chicken                     335     2.6%
(blank)                     319     2.5%
spicy                       196     1.5%
salad                       187     1.5%
fries                       178     1.4%
roll                        138     1.1%
chocolate                   116     0.9%
cheese                      114     0.9%
pork                        114     0.9%
tuna                        108     0.8%
Other values (2149)         10973   85.9%

item_quantity
Real number (ℝ≥0)

MISSING

Distinct: 11
Distinct (%): 0.2%
Missing: 1230
Missing (%): 20.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.248264254
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 1
95-th percentile: 2
Maximum: 16
Range: 15
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.785903823
Coefficient of variation (CV): 0.6295973152
Kurtosis: 102.754254
Mean: 1.248264254
Median Absolute Deviation (MAD): 0
Skewness: 7.708421156
Sum: 5933
Variance: 0.617644819
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value                 Count   Frequency (%)
1                     3980    66.5%
2                     570     9.5%
3                     112     1.9%
4                     54      0.9%
6                     14      0.2%
5                     13      0.2%
8                     4       0.1%
15                    3       0.1%
7                     1       < 0.1%
12                    1       < 0.1%
(Missing)             1230    20.6%

Value                 Count   Frequency (%)
1                     3980    66.5%
2                     570     9.5%
3                     112     1.9%
4                     54      0.9%
5                     13      0.2%
6                     14      0.2%
7                     1       < 0.1%
8                     4       0.1%
12                    1       < 0.1%
15                    3       0.1%

Value                 Count   Frequency (%)
16                    1       < 0.1%
15                    3       0.1%
12                    1       < 0.1%
8                     4       0.1%
7                     1       < 0.1%
6                     14      0.2%
5                     13      0.2%
4                     54      0.9%
3                     112     1.9%
2                     570     9.5%
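
Because item_quantity has only 11 distinct values, its summary statistics can be rebuilt from the frequency tables above. The sketch below does so; the use of the sample variance (dividing by n - 1) is an assumption, chosen because it reproduces the reported figure.

```python
from math import sqrt

# Value -> count, combined from the item_quantity frequency tables.
quantity_counts = {
    1: 3980, 2: 570, 3: 112, 4: 54, 5: 13, 6: 14,
    7: 1, 8: 4, 12: 1, 15: 3, 16: 1,
}

n = sum(quantity_counts.values())                       # non-missing rows
total = sum(v * c for v, c in quantity_counts.items())  # Sum
mean = total / n

# Assumption: sample variance with n - 1 in the denominator.
variance = sum(c * (v - mean) ** 2 for v, c in quantity_counts.items()) / (n - 1)
std = sqrt(variance)

print(f"n = {n}, sum = {total}")
print(f"mean = {mean:.6f}, variance = {variance:.6f}, std = {std:.6f}")
```

The non-missing count (4753) equals 5983 observations minus 1230 missing values, and the recomputed sum, mean, variance and standard deviation agree with the descriptive statistics above.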

item_category_name
Categorical

HIGH CARDINALITY
MISSING

Distinct: 767
Distinct (%): 16.1%
Missing: 1230
Missing (%): 20.6%
Memory size: 46.9 KiB

Sides                 193
Burgers               148
Appetizers            145
Sandwiches            123
Fries                 111
Other values (762)    4033

Length

Max length: 53
Median length: 10
Mean length: 11.5480749
Min length: 3

Unique

Unique: 315
Unique (%): 6.6%

Sample

1st row: Beverages
2nd row: Munchables
3rd row: Drinks
4th row: Cold Pressed Juices
5th row: Maki (Special Rolls)

Common Values

Value                 Count   Frequency (%)
Sides                 193     3.2%
Burgers               148     2.5%
Appetizers            145     2.4%
Sandwiches            123     2.1%
Fries                 111     1.9%
Salads                103     1.7%
Cookies               86      1.4%
Signatures            82      1.4%
Drinks                69      1.2%
Naked Balls           62      1.0%
Other values (757)    3631    60.7%
(Missing)             1230    20.6%

Length

Histogram of lengths of the category

Value                 Count   Frequency (%)
(blank)               381     4.4%
sides                 280     3.2%
appetizers            197     2.3%
rolls                 196     2.2%
sandwiches            185     2.1%
salads                182     2.1%
dinner                170     1.9%
burgers               169     1.9%
the                   128     1.5%
special               122     1.4%
Other values (681)    6733    77.0%

how_long_it_took_to_order
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 2170
Distinct (%): 71.4%
Missing: 2945
Missing (%): 49.2%
Memory size: 46.9 KiB

03:20.2               8
04:28.8               6
06:36.3               6
04:54.4               6
04:34.3               5
Other values (2165)   3007

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 1544
Unique (%): 50.8%

Sample

1st row: 19:58.6
2nd row: 25:09.1
3rd row: 06:44.5
4th row: 03:45.0
5th row: 07:14.3
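
The column is profiled as Categorical because its values are strings. If the values are read as minutes, seconds and tenths of a second (an assumption; the report does not state the format, and the strings could instead be truncated timestamps), they can be converted to a numeric duration. The helper name `to_seconds` below is hypothetical.

```python
def to_seconds(value: str) -> float:
    """Convert an 'MM:SS.s' string to seconds (assumed interpretation)."""
    minutes, seconds = value.split(":")
    return int(minutes) * 60 + float(seconds)


# The five sample rows shown above.
samples = ["19:58.6", "25:09.1", "06:44.5", "03:45.0", "07:14.3"]
durations = [to_seconds(s) for s in samples]

for raw, secs in zip(samples, durations):
    print(f"{raw} -> {secs:.1f} s")
```

After such a conversion the column would be numeric, so a profile of it would report quantiles and a histogram instead of high-cardinality and uniform-distribution alerts.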

Common Values

Value                 Count   Frequency (%)
03:20.2               8       0.1%
04:28.8               6       0.1%
06:36.3               6       0.1%
04:54.4               6       0.1%
04:34.3               5       0.1%
07:08.8               5       0.1%
03:53.2               5       0.1%
04:45.2               5       0.1%
05:38.2               5       0.1%
05:17.8               5       0.1%
Other values (2160)   2982    49.8%
(Missing)             2945    49.2%

Length

Histogram of lengths of the category

Value                 Count   Frequency (%)
03:20.2               8       0.3%
04:54.4               6       0.2%
04:28.8               6       0.2%
06:36.3               6       0.2%
05:17.8               5       0.2%
04:34.3               5       0.2%
04:21.4               5       0.2%
05:52.9               5       0.2%
06:31.5               5       0.2%
05:38.2               5       0.2%
Other values (2160)   2982    98.2%

pickup_lat
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1210
Distinct (%): 20.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.74142455
Minimum: 40.66561086
Maximum: 40.8180821
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.9 KiB

Quantile statistics

Minimum: 40.66561086
5-th percentile: 40.71527933
Q1: 40.72434
median: 40.73567676
Q3: 40.7587261
95-th percentile: 40.7821167
Maximum: 40.8180821
Range: 0.15247124
Interquartile range (IQR): 0.0343861

Descriptive statistics

Standard deviation: 0.02283251627
Coefficient of variation (CV): 0.000560425084
Kurtosis: 0.1879280573
Mean: 40.74142455
Median Absolute Deviation (MAD): 0.01265693
Skewness: 0.4966340253
Sum: 243755.9431
Variance: 0.0005213237994
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
40.72434              159     2.7%
40.72610966           151     2.5%
40.73179547           113     1.9%
40.71527933           109     1.8%
40.72301983           102     1.7%
40.7821167            93      1.6%
40.74507859           87      1.5%
40.72751895           86      1.4%
40.7291104            74      1.2%
40.73567676           68      1.1%
Other values (1200)   4941    82.6%

Value                 Count   Frequency (%)
40.66561086           5       0.1%
40.67176012           2       < 0.1%
40.67347806           3       0.1%
40.67425095           9       0.2%
40.67469097           1       < 0.1%
40.675263             1       < 0.1%
40.6753166            6       0.1%
40.67610552           1       < 0.1%
40.67768312           5       0.1%
40.67790077           2       < 0.1%

Value                 Count   Frequency (%)
40.8180821            17      0.3%
40.81535397           1       < 0.1%
40.81133225           1       < 0.1%
40.80707234           1       < 0.1%
40.80625721           1       < 0.1%
40.805839             1       < 0.1%
40.80583493           1       < 0.1%
40.80504518           1       < 0.1%
40.80482529           1       < 0.1%
40.80451853           1       < 0.1%

pickup_lon
Real number (ℝ)

HIGH CORRELATION

Distinct: 1178
Distinct (%): 19.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.98710128
Minimum: -74.01583672
Maximum: -73.92097967
Zeros: 0
Zeros (%): 0.0%
Negative: 5983
Negative (%): 100.0%
Memory size: 46.9 KiB

Quantile statistics

Minimum: -74.01583672
5-th percentile: -74.0086925
Q1: -73.99663005
median: -73.98868203
Q3: -73.9807391
95-th percentile: -73.95509005
Maximum: -73.92097967
Range: 0.09485705
Interquartile range (IQR): 0.01589095

Descriptive statistics

Standard deviation: 0.01489569141
Coefficient of variation (CV): -0.0002013282201
Kurtosis: 0.4094801605
Mean: -73.98710128
Median Absolute Deviation (MAD): 0.00794802
Skewness: 0.7404730568
Sum: -442664.8269
Variance: 0.0002218816226
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
-73.99096             159     2.7%
-74.00249186          151     2.5%
-73.98567259          113     1.9%
-74.01486039          109     1.8%
-73.995854            102     1.7%
-73.9807391           93      1.6%
-73.98882151          87      1.5%
-73.9886713           86      1.4%
-73.98424546          74      1.2%
-73.99382953          68      1.1%
Other values (1168)   4941    82.6%

Value                 Count   Frequency (%)
-74.01583672          1       < 0.1%
-74.01544645          3       0.1%
-74.01486039          109     1.8%
-74.01472344          1       < 0.1%
-74.01462589          1       < 0.1%
-74.01345529          1       < 0.1%
-74.01306509          5       0.1%
-74.01235355          15      0.3%
-74.01177049          1       < 0.1%
-74.01175103          2       < 0.1%

Value                 Count   Frequency (%)
-73.92097967          1       < 0.1%
-73.928287            1       < 0.1%
-73.93324098          1       < 0.1%
-73.934196            5       0.1%
-73.93542815          1       < 0.1%
-73.93992             1       < 0.1%
-73.94058885          2       < 0.1%
-73.940994            1       < 0.1%
-73.94206524          1       < 0.1%
-73.94356236          4       0.1%

dropoff_lat
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2841
Distinct (%): 47.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.74421648
Minimum: 40.64935581
Maximum: 40.848324
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.9 KiB

Quantile statistics

Minimum: 40.64935581
5-th percentile: 40.70855722
Q1: 40.72530703
median: 40.74042351
Q3: 40.7638848
95-th percentile: 40.7862259
Maximum: 40.848324
Range: 0.19896819
Interquartile range (IQR): 0.03857777

Descriptive statistics

Standard deviation: 0.02525145915
Coefficient of variation (CV): 0.0006197556692
Kurtosis: 0.1228005566
Mean: 40.74421648
Median Absolute Deviation (MAD): 0.01709401
Skewness: 0.2672815271
Sum: 243772.6472
Variance: 0.0006376361892
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
40.727405             42      0.7%
40.7324282            29      0.5%
40.767651             28      0.5%
40.7236764            23      0.4%
40.7233295            22      0.4%
40.7467644            22      0.4%
40.724455             19      0.3%
40.7202117            19      0.3%
40.7389323            18      0.3%
40.778028             18      0.3%
Other values (2831)   5743    96.0%

Value                 Count   Frequency (%)
40.64935581           2       < 0.1%
40.64940367           1       < 0.1%
40.649574             3       0.1%
40.65214519           2       < 0.1%
40.66688836           1       < 0.1%
40.66807              1       < 0.1%
40.668481             1       < 0.1%
40.66927311           1       < 0.1%
40.670283             1       < 0.1%
40.670584             1       < 0.1%

Value                 Count   Frequency (%)
40.848324             1       < 0.1%
40.837404             1       < 0.1%
40.83602938           1       < 0.1%
40.8351339            3       0.1%
40.8288547            1       < 0.1%
40.826658             1       < 0.1%
40.8260054            1       < 0.1%
40.82371293           1       < 0.1%
40.8235529            2       < 0.1%
40.820046             1       < 0.1%

dropoff_lon
Real number (ℝ)

HIGH CORRELATION

Distinct: 2839
Distinct (%): 47.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.98576547
Minimum: -74.0176786
Maximum: -73.92412352
Zeros: 0
Zeros (%): 0.0%
Negative: 5983
Negative (%): 100.0%
Memory size: 46.9 KiB

Quantile statistics

Minimum: -74.0176786
5-th percentile: -74.00964682
Q1: -74.000297
median: -73.98927961
Q3: -73.97469635
95-th percentile: -73.95276784
Maximum: -73.92412352
Range: 0.09355508
Interquartile range (IQR): 0.02560065

Descriptive statistics

Standard deviation: 0.01805950314
Coefficient of variation (CV): -0.0002440942933
Kurtosis: -0.3696332607
Mean: -73.98576547
Median Absolute Deviation (MAD): 0.01258029
Skewness: 0.598600561
Sum: -442656.8348
Variance: 0.0003261456535
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
-73.991802            42      0.7%
-74.0080594           29      0.5%
-73.966875            28      0.5%
-74.0107235           23      0.4%
-73.9903014           22      0.4%
-73.9915018           22      0.4%
-74.0021308           19      0.3%
-73.953851            18      0.3%
-74.0088234           18      0.3%
-74.0040761           17      0.3%
Other values (2829)   5745    96.0%

Value                 Count   Frequency (%)
-74.0176786           3       0.1%
-74.0172878           1       < 0.1%
-74.0171494           1       < 0.1%
-74.0171217           1       < 0.1%
-74.01706219          1       < 0.1%
-74.0169647           1       < 0.1%
-74.016943            8       0.1%
-74.0168879           2       < 0.1%
-74.016875            2       < 0.1%
-74.01671467          1       < 0.1%

Value                 Count   Frequency (%)
-73.92412352          2       < 0.1%
-73.9248061           1       < 0.1%
-73.925819            4       0.1%
-73.927723            1       < 0.1%
-73.9283556           1       < 0.1%
-73.928499            4       0.1%
-73.929931            1       < 0.1%
-73.93111168          1       < 0.1%
-73.9341815           2       < 0.1%
-73.93488             6       0.1%

when_the_delivery_started
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 4867
Distinct (%): 81.3%
Missing: 0
Missing (%): 0.0%
Memory size: 46.9 KiB

40:37.6               6
09:25.3               5
14:29.8               5
47:57.8               5
55:46.1               5
Other values (4862)   5957

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 3951
Unique (%): 66.0%

Sample

1st row: 51:59.9
2nd row: 58:58.7
3rd row: 39:52.7
4th row: 54:11.5
5th row: 07:18.5

Common Values

Value                 Count   Frequency (%)
40:37.6               6       0.1%
09:25.3               5       0.1%
14:29.8               5       0.1%
47:57.8               5       0.1%
55:46.1               5       0.1%
46:45.8               5       0.1%
26:28.2               4       0.1%
15:42.8               4       0.1%
56:43.9               4       0.1%
37:08.2               4       0.1%
Other values (4857)   5936    99.2%

Length

Histogram of lengths of the category

Value                 Count   Frequency (%)
40:37.6               6       0.1%
14:29.8               5       0.1%
47:57.8               5       0.1%
55:46.1               5       0.1%
46:45.8               5       0.1%
09:25.3               5       0.1%
59:43.0               4       0.1%
29:24.1               4       0.1%
31:28.2               4       0.1%
19:55.9               4       0.1%
Other values (4857)   5936    99.2%

when_the_courier_arrived_at_pickup
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 4446
Distinct (%): 81.8%
Missing: 550
Missing (%): 9.2%
Memory size: 46.9 KiB

03:22.3               6
18:24.3               5
30:47.0               5
06:24.1               5
40:36.5               5
Other values (4441)   5407

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 3638
Unique (%): 67.0%

Sample

1st row: 26:02.1
2nd row: 37:18.8
3rd row: 04:17.8
4th row: 14:42.7
5th row: 18:50.0

Common Values

Value                 Count   Frequency (%)
03:22.3               6       0.1%
18:24.3               5       0.1%
30:47.0               5       0.1%
06:24.1               5       0.1%
40:36.5               5       0.1%
01:08.8               4       0.1%
59:48.5               4       0.1%
46:31.8               4       0.1%
14:06.2               4       0.1%
52:15.5               4       0.1%
Other values (4436)   5387    90.0%
(Missing)             550     9.2%

Length

Histogram of lengths of the category

Value                 Count   Frequency (%)
03:22.3               6       0.1%
30:47.0               5       0.1%
06:24.1               5       0.1%
40:36.5               5       0.1%
18:24.3               5       0.1%
45:18.3               4       0.1%
00:38.6               4       0.1%
54:01.8               4       0.1%
32:06.7               4       0.1%
15:13.6               4       0.1%
Other values (4436)   5387    99.2%

when_the_courier_left_pickup
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 4428
Distinct (%): 81.5%
Missing: 550
Missing (%): 9.2%
Memory size: 46.9 KiB

37:43.3               7
37:15.6               7
15:44.7               6
26:51.9               5
35:41.9               5
Other values (4423)   5403

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 3617
Unique (%): 66.6%

Sample

1st row: 48:23.1
2nd row: 59:10.0
3rd row: 16:37.9
4th row: 25:19.4
5th row: 27:10.6

Common Values

Value                 Count   Frequency (%)
37:43.3               7       0.1%
37:15.6               7       0.1%
15:44.7               6       0.1%
26:51.9               5       0.1%
35:41.9               5       0.1%
43:52.0               5       0.1%
38:07.4               5       0.1%
50:07.4               4       0.1%
24:51.9               4       0.1%
26:33.7               4       0.1%
Other values (4418)   5381    89.9%
(Missing)             550     9.2%

Length

Histogram of lengths of the category

Value                 Count   Frequency (%)
37:43.3               7       0.1%
37:15.6               7       0.1%
15:44.7               6       0.1%
26:51.9               5       0.1%
35:41.9               5       0.1%
43:52.0               5       0.1%
38:07.4               5       0.1%
27:38.1               4       0.1%
40:05.2               4       0.1%
02:29.8               4       0.1%
Other values (4418)   5381    99.0%

when_the_courier_arrived_at_dropoff
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 4849
Distinct (%): 81.0%
Missing: 0
Missing (%): 0.0%
Memory size: 46.9 KiB

35:36.9               5
39:07.3               5
46:58.5               5
41:56.0               5
48:41.9               5
Other values (4844)   5958

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 3932
Unique (%): 65.7%

Sample

1st row: 52:06.3
2nd row: 59:22.9
3rd row: 04:40.6
4th row: 32:38.1
5th row: 48:27.2

Common Values

Value                 Count   Frequency (%)
35:36.9               5       0.1%
39:07.3               5       0.1%
46:58.5               5       0.1%
41:56.0               5       0.1%
48:41.9               5       0.1%
26:44.9               5       0.1%
24:46.1               5       0.1%
25:14.6               4       0.1%
30:36.2               4       0.1%
25:49.0               4       0.1%
Other values (4839)   5936    99.2%

Length

Histogram of lengths of the category

Value                 Count   Frequency (%)
35:36.9               5       0.1%
46:58.5               5       0.1%
41:56.0               5       0.1%
48:41.9               5       0.1%
26:44.9               5       0.1%
24:46.1               5       0.1%
39:07.3               5       0.1%
01:29.0               4       0.1%
57:38.7               4       0.1%
53:41.2               4       0.1%
Other values (4839)   5936    99.2%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
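
As a rough illustration of the calculation just described, the pure-Python sketch below ranks both variables and then divides the covariance of the ranks by the product of their standard deviations. Giving tied values the mean of their rank positions is an assumption the text does not spell out, and the names `average_ranks` and `spearman_rho` are illustrative rather than taken from the report.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their std devs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry) / n)
    return cov / (std_x * std_y)


# A cubic relationship is nonlinear but monotonic, so rho is (numerically) 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(round(spearman_rho(x, y), 6))
```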

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
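
The formula above, and the invariance under shifts and positive rescaling, can be checked with a short sketch. The function name `pearson_r` and the toy numbers are illustrative assumptions, not values from the dataset.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their std devs."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
# Shifting and rescaling x by a positive factor leaves r unchanged.
r_rescaled = pearson_r([3 * a + 7 for a in x], y)
# Negating one variable flips the sign but keeps the magnitude.
r_flipped = pearson_r(x, [-b for b in y])
print(r, r_rescaled, r_flipped)
```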

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
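
A minimal sketch of that pair-counting definition follows. It implements the plain formula as stated (often called τ-a) and applies no correction for ties; libraries frequently use the tie-corrected τ-b variant instead, which this sketch does not attempt. The name `kendall_tau_a` is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, no tie correction."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move the same way
        elif direction < 0:
            discordant += 1  # the variables move in opposite ways
        # direction == 0 means a tie in x or y: counted in neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# One swapped pair out of six: 5 concordant, 1 discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```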

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
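
For orientation, here is a sketch of a bias-corrected Cramér's V along the lines of Bergsma's 2013 proposal: the chi-squared statistic is turned into φ², a term (k-1)(r-1)/(n-1) is subtracted, and the row and column counts are shrunk before normalising. This is written from the published formula, not copied from the report's implementation; the name `cramers_v_corrected` and the toy inputs are assumptions.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equal-length categorical sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # a constant column carries no association information
    return math.sqrt(phi2_corr / denom)


# Perfect association: each vehicle type maps to exactly one category.
vehicles = ["bicycle", "car"] * 10
categories = ["Burger", "Pizza"] * 10
print(cramers_v_corrected(vehicles, categories))
```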

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
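
The nullity correlation mentioned above can be sketched as the correlation between two missingness indicators. For binary indicators this is the phi coefficient of the 2x2 missing/present table, which equals Pearson's r on the 0/1 flags. The sketch assumes missing cells are represented as `None`; the function name `nullity_correlation` is illustrative, and the sample columns are the first ten rows shown in this report.

```python
import math


def nullity_correlation(col_a, col_b):
    """Phi coefficient between the 'is missing' flags of two columns."""
    miss_a = [v is None for v in col_a]
    miss_b = [v is None for v in col_b]
    n11 = sum(1 for a, b in zip(miss_a, miss_b) if a and b)
    n10 = sum(1 for a, b in zip(miss_a, miss_b) if a and not b)
    n01 = sum(1 for a, b in zip(miss_a, miss_b) if not a and b)
    n00 = sum(1 for a, b in zip(miss_a, miss_b) if not a and not b)
    a_missing, a_present = n11 + n10, n01 + n00
    b_missing, b_present = n11 + n01, n10 + n00
    denom = math.sqrt(a_missing * a_present * b_missing * b_present)
    if denom == 0:
        return float("nan")  # a column that is never (or always) missing
    return (n11 * n00 - n10 * n01) / denom


# First ten rows of the sample; None stands in for NaN (an assumption).
arrived_at_pickup = [None, "26:02.1", "37:18.8", "04:17.8", "14:42.7",
                     "18:50.0", "07:16.0", "17:35.7", "55:32.4", "20:14.4"]
left_pickup = [None, "48:23.1", "59:10.0", "16:37.9", "25:19.4",
               "27:10.6", "29:24.5", "03:24.4", "01:22.2", "47:03.6"]
order_duration = ["19:58.6", "25:09.1", "06:44.5", None, "03:45.0",
                  "07:14.3", "04:49.4", None, None, None]

print(nullity_correlation(arrived_at_pickup, left_pickup))     # missing together
print(nullity_correlation(arrived_at_pickup, order_duration))  # missing apart
```

A value near +1 means the two columns tend to be missing in the same rows, while a negative value means one tends to be present when the other is missing.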

Sample

First rows

delivery_id  customer_id  courier_id  vehicle_type  pickup_place  place_category  item_name  item_quantity  item_category_name  how_long_it_took_to_order  pickup_lat  pickup_lon  dropoff_lat  dropoff_lon  when_the_delivery_started  when_the_courier_arrived_at_pickup  when_the_courier_left_pickup  when_the_courier_arrived_at_dropoff
01457973327168162381vanMelt ShopAmericanLemonade1.0Beverages19:58.640.744607-73.99074240.752073-73.98537051:59.9NaNNaN52:06.3
1137705664452104533bicyclePrince Street PizzaPizzaNeapolitan Rice Balls3.0Munchables25:09.140.723080-73.99461540.719722-73.99185858:58.726:02.148:23.159:22.9
2147654783095132725bicycleBareburgerBurgerBare Sodas1.0Drinks06:44.540.728478-73.99839240.728606-73.99514339:52.737:18.859:10.004:40.6
31485494271149157175bicycleJuice PressJuice BarOMG! My Favorite Juice!1.0Cold Pressed JuicesNaN40.738868-74.00274740.751257-74.00563454:11.504:17.816:37.932:38.1
41327707122609118095bicycleBlue Ribbon SushiJapaneseSpicy Tuna & Tempura Flakes2.0Maki (Special Rolls)03:45.040.726110-74.00249240.709323-74.01586707:18.514:42.725:19.448:27.2
514231427516991932bicycleTamarind TriBeCaIndianDum Aloo Gobi1.0Vegetarian Specialties07:14.340.719269-74.00875040.725678-74.00061856:36.318:50.027:10.636:53.8
61334106101347124897bicycleThe LoopSushiSpicy Tuna Roll1.0Classic Roll & Hand Roll04:49.440.734858-73.98609340.738368-74.00010508:55.507:16.029:24.540:01.7
713116195916179847bicycleInsomnia CookiesBakeryChocolate Chunk2.0Cookies and BrowniesNaN40.729791-74.00058940.734703-73.99820620:09.317:35.703:24.409:16.6
8148767455375181543bicycleCafe ZaiyaNaNNaNNaNNaNNaN40.729357-73.99015640.719758-73.98501149:48.355:32.401:22.210:44.7
91417206153816157415carShake ShackBurgerShackburger1.0BurgersNaN40.758457-73.98914040.743613-73.97768418:37.920:14.447:03.659:26.1

Last rows

delivery_id  customer_id  courier_id  vehicle_type  pickup_place  place_category  item_name  item_quantity  item_category_name  how_long_it_took_to_order  pickup_lat  pickup_lon  dropoff_lat  dropoff_lon  when_the_delivery_started  when_the_courier_arrived_at_pickup  when_the_courier_left_pickup  when_the_courier_arrived_at_dropoff
5973137995312710369993bicycleThe Grey Dog - UniversityCoffeeOrganic Raw Veggies1.0Small Plates04:34.440.733787-73.99303540.736760-73.98204835:30.043:14.450:04.657:28.4
59741379770243775138061carPostmates Liquor StoreShopRaventos I Blanc L'Hereu Reserva Brut Cava - 20111.0SparklingNaN40.779598-73.94738840.723861-73.99698108:33.122:28.426:54.057:58.4
59751475459303211156557carBig Nick's Burgers & PizzaPizzaChicken Fingers1.0Appetizers & Side Orders05:11.140.776767-73.97912040.793834-73.94152147:25.948:56.211:46.125:02.1
5976130026613545136664bicycleJuice GenerationJuice BarRed Dragon Fruit1.0SmoothiesNaN40.777598-73.97952840.778493-73.98654248:36.954:59.506:07.413:57.9
59771303444228541148268bicycleRedFarm HudsonChineseSoft & Crunchy Vegetable Fried Rice1.0Rice & Noodles03:32.140.734214-74.00620240.732425-73.99626922:52.535:32.059:22.607:20.0
59781360750378035151467bicycleFive Guys Burgers and FriesBurgerCheeseburger1.0BurgersNaN40.804404-73.96643040.818637-73.93924130:25.641:09.453:28.310:19.3
59791348697969433296bicycleCafe MogadorMiddle EasternVegetarian1.0Cous CousNaN40.727293-73.98451740.725938-73.98055037:42.644:21.354:16.400:43.0
59801274438355090153113bicycleShake ShackBurgerFries1.0FriesNaN40.780826-73.97648340.763573-73.97350312:57.029:59.852:19.816:52.7
59811470282400983142140carOmaiVietnameseCa Tim1.0Appetizers02:17.340.744408-74.00289140.734609-74.00640525:03.622:37.839:25.850:45.3
59821357449128517134189carRedFarm BroadwayChinesePan-Fried Pork Buns (4)1.0Dim Sum09:25.340.782117-73.98073940.732714-73.99504030:44.540:22.506:02.131:54.4

Duplicate rows

Most frequently occurring

delivery_id  customer_id  courier_id  vehicle_type  pickup_place  place_category  item_name  item_quantity  item_category_name  how_long_it_took_to_order  pickup_lat  pickup_lon  dropoff_lat  dropoff_lon  when_the_delivery_started  when_the_courier_arrived_at_pickup  when_the_courier_left_pickup  when_the_courier_arrived_at_dropoff  # duplicates
012747915974961162bicycleBareburgerBurgerBarest Burger1.0Bareburgers04:00.540.768489-73.95518240.784797-73.95365806:44.931:56.037:43.351:34.22
1127999085091105178bicycleThe Meatball ShopItalianIce Cream Sandwich1.0Desserts03:11.440.771491-73.95633440.779800-73.95706155:27.111:42.522:33.346:35.12
2128072812714957496bicycleP.J. Clarke'sAmericanSimply On A Bun1.0P.J. Clarke's Hamburgers16:31.340.758939-73.96853140.766654-73.95284715:18.237:50.850:58.803:28.52
31319722149234139558bicycleMaimonide of BrooklynVegetarianMob Cheeseburger Deluxe1.0Entrees06:24.040.685201-73.98021140.649574-73.96439257:37.708:42.919:32.537:39.52
41341790373689152676bicycleBareburgerBurgerCountry Bacon1.0Bareburgers21:38.740.768489-73.95518240.765698-73.95497559:28.319:47.340:35.946:48.62
5134344917495138057carThe Meatball ShopItalianIce Cream Sandwich1.0Desserts02:26.440.721545-73.98884240.709410-74.00671438:33.249:06.100:26.219:37.72
613497024278165763bicycle2nd Ave DeliDeliPotato Knish2.0Franks and Knishes06:54.440.745299-73.97913840.738913-73.98776052:26.821:08.340:57.149:50.82
7136324410073830905carThe Meatball ShopItalianClassic Beef1.0Naked Balls09:06.440.745963-74.00169740.747098-73.98664144:35.305:45.711:05.524:17.02
81423447391860167596bicycleLucky's Famous BurgersBurgerLucky Shake1.0Shakes10:07.640.723331-73.98953140.720472-73.98932624:39.856:48.503:05.006:07.02
9142827830169562487carsweetgreenSaladHarvest Bowl1.0Signatures06:17.340.745079-73.98882240.750667-73.98609932:27.459:13.609:07.622:30.52